From Constraints to Opportunities: Efficient Object Detection Learning for Humanoid Robots
Reliable perception and efficient adaptation to novel conditions are priority skills for robots that operate in ever-changing environments. Indeed, autonomously operating in real-world scenarios requires identifying the current state of the context and acting accordingly. Moreover, the requested tasks might not be known a priori, requiring the system to update online. Robotic platforms can gather various types of perceptual information thanks to the multiple sensory modalities they are equipped with. Nonetheless, the latest results in computer vision motivate a particular interest in visual perception. Specifically, in this thesis I mainly focus on the object detection task, since it can serve as the basis for more sophisticated capabilities.
The vast advancements in recent computer vision research, brought by deep learning methods, are appealing in a robotic setting. However, their adoption in applied domains is not straightforward, since adapting them to new tasks is highly demanding in terms of annotated data, optimization time, and computational resources. These requirements generally do not meet current robotics constraints. Nevertheless, robotic platforms, and especially humanoids, present opportunities that can be exploited. The sensors they are equipped with represent precious sources of additional information. Moreover, their embodiment in the workspace and their motion capabilities allow for natural interaction with the environment.
Motivated by these considerations, in this Ph.D. project I mainly aimed at devising and developing solutions that integrate the worlds of computer vision and robotics, focusing on the task of object detection. Specifically, I dedicated a large amount of effort to alleviating the requirements of state-of-the-art methods in terms of annotated data and training time, while preserving their accuracy by exploiting the opportunities offered by robotics.
A Grasp Pose is All You Need: Learning Multi-fingered Grasping with Deep Reinforcement Learning from Vision and Touch
Multi-fingered robotic hands could enable robots to perform sophisticated manipulation tasks. However, teaching a robot to grasp objects with an anthropomorphic hand is an arduous problem due to the high dimensionality of the state and action spaces. Deep Reinforcement Learning (DRL) offers techniques to design control policies for this kind of problem without explicit environment or hand modeling. However, training these policies with state-of-the-art model-free algorithms is greatly challenging for multi-fingered hands. The main problem is that efficient exploration of the environment is not possible for such high-dimensional problems, causing issues in the initial phases of policy optimization. One possibility to address this is to rely on off-line task demonstrations, but these are oftentimes incredibly demanding to acquire in terms of time and computational resources. In this work, we overcome these requirements and propose the A Grasp Pose is All You Need (G-PAYN) method for the anthropomorphic hand of the iCub humanoid. We develop an approach to automatically collect task demonstrations to initialize the training of the policy. The proposed grasping pipeline starts from a grasp pose generated by an external algorithm, which is used to initiate the movement. A control policy (previously trained with the proposed G-PAYN) is then used to reach and grasp the object. We deploy the iCub in the MuJoCo simulator and use it to test our approach with objects from the YCB-Video dataset. The results show that G-PAYN outperforms the baseline DRL techniques in the considered setting in terms of success rate and execution time. The code to reproduce the experiments will be released upon acceptance.
Comment: Submitted to IROS 202
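The abstract above describes a two-stage pipeline: an externally generated grasp pose initiates the motion, and automatically collected demonstrations seed the DRL policy training. The following is a minimal Python sketch of the demonstration-collection idea under toy assumptions; the names (DummyGraspEnv, collect_demo) and the simple proportional controller are illustrative stand-ins, not the authors' actual G-PAYN implementation.

```python
import numpy as np

class DummyGraspEnv:
    """Minimal stand-in for a simulated grasping environment (illustrative only)."""
    def __init__(self, target_pose):
        self.target_pose = np.asarray(target_pose, dtype=float)
        self.hand_pose = np.zeros(6)  # x, y, z, roll, pitch, yaw

    def reset(self):
        self.hand_pose = np.zeros(6)
        return {"hand_pose": self.hand_pose.copy()}

    def step(self, action):
        self.hand_pose += action  # toy kinematics: action is a pose delta
        dist = np.linalg.norm(self.target_pose - self.hand_pose)
        done = dist < 1e-2
        reward = -dist + (10.0 if done else 0.0)
        return {"hand_pose": self.hand_pose.copy()}, reward, done, {}

def collect_demo(env, grasp_pose, k_p=0.2, horizon=200):
    """Servo toward an externally generated grasp pose, recording transitions
    that could seed the replay buffer of an off-policy DRL algorithm."""
    obs, demo = env.reset(), []
    for _ in range(horizon):
        # Proportional step toward the grasp pose provided by the external algorithm.
        action = k_p * (np.asarray(grasp_pose) - obs["hand_pose"])
        next_obs, reward, done, _ = env.step(action)
        demo.append((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break
    return demo

pose = [0.3, 0.1, 0.2, 0.0, 0.0, 0.0]
demo = collect_demo(DummyGraspEnv(target_pose=pose), grasp_pose=pose)
print(f"collected {len(demo)} transitions")
```

In the paper's setting the environment would be the iCub hand in MuJoCo and the grasp pose would come from a dedicated grasp-synthesis algorithm; the sketch only conveys how scripted motions toward such a pose can yield demonstrations at no annotation cost.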
Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis
We consider the task of object grasping with a prosthetic hand capable of multiple grasp types. In this setting, communicating the intended grasp type often requires a high user cognitive load, which can be reduced by adopting shared-autonomy frameworks. Among these, so-called eye-in-hand systems automatically control the hand pre-shaping before the grasp, based on visual input coming from a camera on the wrist. In this paper, we present an eye-in-hand learning-based approach for hand pre-shape classification from RGB sequences.
Unlike previous work, we design the system to support grasping each considered object part with a different grasp type. To overcome the lack of data of this kind and reduce the need for tedious data collection sessions to train the system, we devise a pipeline for rendering synthetic visual sequences of hand trajectories. We develop a sensorized setup to acquire real human grasping sequences for benchmarking and show that, when compared on practical use cases, models trained with our synthetic dataset achieve better generalization performance than models trained on real data. We finally integrate our model on the Hannes prosthetic hand and show its practical effectiveness. We make the code and dataset needed to reproduce the presented results publicly available.
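To illustrate the kind of model the abstract describes, here is a minimal PyTorch sketch of an RGB-sequence classifier for grasp pre-shapes. The architecture (a small convolutional frame encoder followed by an LSTM), the class name PreShapeClassifier, and all sizes are assumptions for illustration, not the exact model trained for the Hannes hand.

```python
import torch
import torch.nn as nn

class PreShapeClassifier(nn.Module):
    """Illustrative sequence classifier: per-frame CNN features + LSTM head."""
    def __init__(self, num_preshapes=4, feat_dim=128, hidden_dim=64):
        super().__init__()
        # Small per-frame encoder; a pretrained backbone could be swapped in.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        # LSTM aggregates the approach-to-grasp sequence over time.
        self.rnn = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_preshapes)

    def forward(self, frames):  # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.rnn(feats)
        return self.head(h[-1])  # logits over grasp pre-shapes

model = PreShapeClassifier()
logits = model(torch.randn(2, 8, 3, 64, 64))  # 2 clips of 8 RGB frames each
print(logits.shape)  # torch.Size([2, 4])
```

In the paper's pipeline, such a model would be trained on rendered synthetic sequences of hand trajectories and evaluated on real wrist-camera recordings; the sketch only shows the general sequence-to-label structure.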
Hand-Object Interaction: From Human Demonstrations to Robot Manipulation
Human-object interaction is of great relevance for robots operating in human environments. However, state-of-the-art robotic hands are still far from replicating human skills. It is therefore essential to study how humans use their hands in order to develop similar robotic capabilities. This article presents a deep dive into hand-object interaction and human demonstrations, highlighting the main challenges in this research area and suggesting desirable future developments. To this end, the article presents a general definition of the hand-object interaction problem, together with a concise review of each of the main subproblems involved, namely sensing, perception, and learning. Furthermore, the article discusses the interplay between these subproblems and describes how their interaction in learning from demonstration contributes to the success of robot manipulation. In this way, the article provides a broad overview of the interdisciplinary approaches necessary for a robotic system to learn new manipulation skills by observing human behavior in the real world.
Trials Supported By Smart Networks Beyond 5G: the TrialsNet Approach
TrialsNet is a project focused on improving European urban ecosystems through 13 innovative use cases in three representative domains: Infrastructure, Transportation, Security and Safety; eHealth and Emergency; and Culture, Tourism, and Entertainment. These use cases will be implemented across different clusters in Italy, Spain, Greece, and Romania, involving real users. This paper provides an overview of the various use cases that will be trialled in different contexts through the platform and network solutions deployed by the project, based on advanced functionalities such as dynamic slicing management, NFV, MEC, AI/ML, and others. To this end, TrialsNet will develop assessment frameworks to measure the impact of the use cases at the technical, socio-economic, and societal levels through the definition and measurement of proper Key Performance Indicators (KPIs) and Key Value Indicators (KVIs). The project seeks to identify network limitations, optimize infrastructure, and define new requirements for next-generation mobile networks. Ultimately, TrialsNet aims to enhance livability in urban environments by driving advancements in these domains.